Add FFT support via AbstractFFTs interface #713

Open
KaanKesginLW wants to merge 43 commits into JuliaGPU:main from KaanKesginLW:feature/fft-support

@KaanKesginLW (Contributor) commented Dec 3, 2025

Adds FFT support for MtlArray via the AbstractFFTs.jl interface.

Heavily based on CUDA.jl's implementation of the AbstractFFTs.jl interface, with the transforms themselves implemented using MPSGraph functionality.

using Metal
using AbstractFFTs  # brings fft into scope; assumed needed unless Metal re-exports it

x = MtlArray(randn(ComplexF32, 2048, 2048))
y = fft(x)  # Just works!

Performance

Benchmarked on Apple M2 Max with 30-core GPU against FFTW.jl on CPU:

| Size | CPU (FFTW) | GPU (Metal) |
|---|---|---|
| 512×512 | 4.1 ms | 5.3 ms |
| 1024×1024 | 19.7 ms | 8.5 ms |
| 2048×2048 | 119.7 ms | 10.5 ms |
| 4096×4096 | 460.6 ms | 15.8 ms |
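
To make the trend in these timings easier to read, here is a small sketch that turns them into CPU/GPU speedup ratios. The numbers are copied from the table above; the variable names are only illustrative.

```julia
# Timings in milliseconds, copied from the benchmark table above.
sizes  = ["512×512", "1024×1024", "2048×2048", "4096×4096"]
cpu_ms = [4.1, 19.7, 119.7, 460.6]   # FFTW.jl on CPU
gpu_ms = [5.3, 8.5, 10.5, 15.8]      # Metal on GPU

# A ratio above 1 means the GPU run was faster than the CPU run.
speedup = cpu_ms ./ gpu_ms

for (s, r) in zip(sizes, speedup)
    println(rpad(s, 10), " ", round(r; digits = 1), "x")
end
```

On these figures the GPU is slightly slower at 512×512 (about 0.8x) and pulls ahead as the problem grows, reaching roughly 29x at 4096×4096.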

Example Usage

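using Metal
using AbstractFFTs  # brings fft, ifft, rfft, irfft, plan_fft into scope; assumed needed unless Metal re-exports them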

# Complex FFT
x = MtlArray(randn(ComplexF32, 1024, 1024))
y = fft(x)
z = ifft(y)  # z ≈ x

# Real FFT  
r = MtlArray(randn(Float32, 1024, 1024))
c = rfft(r)           # Real → Complex
r2 = irfft(c, 1024)   # Complex → Real, r2 ≈ r

# FFT along specific dimensions
y = fft(x, 1)         # 1D FFT along the first dimension, batched over columns
y = fft(x, (1, 2))    # Both dimensions (same as fft(x) for a 2D array)

# Plan reuse
x = MtlArray(randn(ComplexF32, 1024, 1024))
another_x = MtlArray(randn(ComplexF32, 1024, 1024))
p = plan_fft(x)
y1 = p * x
y2 = p * another_x    # Same plan, different data
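
The calls above follow the general AbstractFFTs.jl conventions, so their semantics can be checked on the CPU without a GPU. The sketch below does that with FFTW.jl on small arrays; it is an illustration of the interface, not a test of this PR, and whether every call shown here is supported for MtlArray is not confirmed by this thread.

```julia
using FFTW  # CPU implementation of the AbstractFFTs interface

x = randn(ComplexF32, 8, 8)

# Transforming over both dimensions of a 2D array equals the default fft(x).
@assert fft(x, (1, 2)) ≈ fft(x)

# fft(x, 1) is a batch of independent 1D transforms, one per column.
cols = fft(x, 1)
@assert cols[:, 3] ≈ fft(x[:, 3])

# Complex round trip.
@assert ifft(fft(x)) ≈ x

# Real transform: the first dimension shrinks to n ÷ 2 + 1,
# so irfft needs the original length (8) to invert it.
r = randn(Float32, 8, 8)
c = rfft(r)
@assert size(c) == (5, 8)
@assert irfft(c, 8) ≈ r

# A plan can be reused on any array with the same size and element type.
p = plan_fft(x)
another_x = randn(ComplexF32, 8, 8)
@assert p * x ≈ fft(x)
@assert p * another_x ≈ fft(another_x)
```

The same assertions, with the inputs wrapped in MtlArray and the results copied back with Array, would be one way to compare the GPU path against FFTW.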

Close #270

@KaanKesginLW KaanKesginLW mentioned this pull request Dec 3, 2025

codecov Bot commented Dec 3, 2025

Codecov Report

❌ Patch coverage is 80.98592% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.78%. Comparing base (706b87f) to head (d63bf16).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/fft.jl | 78.74% | 27 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff            @@
##             main     #713    +/-   ##
========================================
  Coverage   80.77%   80.78%            
========================================
  Files          61       63     +2     
  Lines        2866     3008   +142     
========================================
+ Hits         2315     2430   +115     
- Misses        551      578    +27     
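
The percentages in the report can be reproduced from the line counts in the diff. The sketch below assumes the patch only added lines, so that the net change in lines and hits equals the patch totals; that assumption is consistent with the 27 missing lines reported, but it is not stated by Codecov.

```julia
# Line counts copied from the coverage diff above (base 706b87f, head d63bf16).
lines_base, hits_base = 2866, 2315
lines_head, hits_head = 3008, 2430

# Project coverage is hits divided by total lines.
project_base = 100 * hits_base / lines_base
project_head = 100 * hits_head / lines_head

# Assumption: the net deltas are the patch itself (only additions).
new_lines = lines_head - lines_base   # 142
new_hits  = hits_head - hits_base     # 115
patch     = 100 * new_hits / new_lines

println("base:  ", round(project_base; digits = 2), "%")
println("head:  ", round(project_head; digits = 2), "%")
println("patch: ", round(patch; digits = 5), "%")
```

Under that assumption, 115 covered lines out of 142 new lines gives the reported patch coverage of 80.98592%, and the project figure moves from 80.77% to 80.78%.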

@github-actions github-actions Bot left a comment

Metal Benchmarks

Benchmark suite Current: d63bf16 Previous: 706b87f Ratio
array/accumulate/Float32/1d 1128208.5 ns 1111917 ns 1.01
array/accumulate/Float32/dims=1 1553459 ns 1537333 ns 1.01
array/accumulate/Float32/dims=1L 9811959 ns 9801916.5 ns 1.00
array/accumulate/Float32/dims=2 1861625 ns 1836250 ns 1.01
array/accumulate/Float32/dims=2L 7211937.5 ns 7183625 ns 1.00
array/accumulate/Int64/1d 1252666 ns 1241334 ns 1.01
array/accumulate/Int64/dims=1 1812667 ns 1812541 ns 1.00
array/accumulate/Int64/dims=1L 11791646 ns 11645625 ns 1.01
array/accumulate/Int64/dims=2 2152708 ns 2147520.5 ns 1.00
array/accumulate/Int64/dims=2L 9746042 ns 9754042 ns 1.00
array/broadcast 1017500 ns 1010999.5 ns 1.01
array/construct 5625 ns 5708 ns 0.99
array/permutedims/2d 1152354.5 ns 1148187.5 ns 1.00
array/permutedims/3d 1656583 ns 1656083 ns 1.00
array/permutedims/4d 2628000 ns 2633729 ns 1.00
array/private/copy 552833 ns 543083 ns 1.02
array/private/copyto!/cpu_to_gpu 783084 ns 780167 ns 1.00
array/private/copyto!/gpu_to_cpu 784042 ns 778333 ns 1.01
array/private/copyto!/gpu_to_gpu 609375 ns 611292 ns 1.00
array/private/iteration/findall/bool 1429958 ns 1320854 ns 1.08
array/private/iteration/findall/int 1597208 ns 1546271 ns 1.03
array/private/iteration/findfirst/bool 1972875 ns 1983083 ns 0.99
array/private/iteration/findfirst/int 2033083 ns 2003667 ns 1.01
array/private/iteration/findmin/1d 2247834 ns 2254458 ns 1.00
array/private/iteration/findmin/2d 2018270.5 ns 2007458 ns 1.01
array/private/iteration/logical 2519167 ns 2527292 ns 1.00
array/private/iteration/scalar 5097708 ns 5570229.5 ns 0.92
array/random/rand/Float32 1165709 ns 1109542 ns 1.05
array/random/rand/Int64 1313458 ns 1291000 ns 1.02
array/random/rand!/Float32 967208 ns 940625 ns 1.03
array/random/rand!/Int64 895292 ns 877666.5 ns 1.02
array/random/randn/Float32 1072021 ns 1067875 ns 1.00
array/random/randn!/Float32 842104 ns 835541.5 ns 1.01
array/reductions/mapreduce/Float32/1d 1114666 ns 1124229.5 ns 0.99
array/reductions/mapreduce/Float32/dims=1 827437.5 ns 832167 ns 0.99
array/reductions/mapreduce/Float32/dims=1L 1336417 ns 1340166.5 ns 1.00
array/reductions/mapreduce/Float32/dims=2 843625 ns 851250 ns 0.99
array/reductions/mapreduce/Float32/dims=2L 1772458 ns 1763500 ns 1.01
array/reductions/mapreduce/Int64/1d 1544104 ns 1551104 ns 1.00
array/reductions/mapreduce/Int64/dims=1 1130041 ns 1126666 ns 1.00
array/reductions/mapreduce/Int64/dims=1L 2033749.5 ns 2021125 ns 1.01
array/reductions/mapreduce/Int64/dims=2 1354583 ns 1269083 ns 1.07
array/reductions/mapreduce/Int64/dims=2L 3581208.5 ns 3587375 ns 1.00
array/reductions/reduce/Float32/1d 1023291.5 ns 1048583.5 ns 0.98
array/reductions/reduce/Float32/dims=1 828167 ns 835563 ns 0.99
array/reductions/reduce/Float32/dims=1L 1346125 ns 1342834 ns 1.00
array/reductions/reduce/Float32/dims=2 864750 ns 854709 ns 1.01
array/reductions/reduce/Float32/dims=2L 1769458 ns 1793437.5 ns 0.99
array/reductions/reduce/Int64/1d 1517666 ns 1509583 ns 1.01
array/reductions/reduce/Int64/dims=1 1116584 ns 1109041 ns 1.01
array/reductions/reduce/Int64/dims=1L 2013000 ns 2022959 ns 1.00
array/reductions/reduce/Int64/dims=2 1158979.5 ns 1149458 ns 1.01
array/reductions/reduce/Int64/dims=2L 4162375 ns 4176771 ns 1.00
array/shared/copy 241125 ns 239875 ns 1.01
array/shared/copyto!/cpu_to_gpu 75812.5 ns 78959 ns 0.96
array/shared/copyto!/gpu_to_cpu 80625 ns 80041 ns 1.01
array/shared/copyto!/gpu_to_gpu 80000 ns 80500 ns 0.99
array/shared/iteration/findall/bool 1447084 ns 1443687.5 ns 1.00
array/shared/iteration/findall/int 1541604.5 ns 1597292 ns 0.97
array/shared/iteration/findfirst/bool 1563542 ns 1560312.5 ns 1.00
array/shared/iteration/findfirst/int 1579896 ns 1587125 ns 1.00
array/shared/iteration/findmin/1d 1845562.5 ns 1847104 ns 1.00
array/shared/iteration/findmin/2d 2020209 ns 2008542 ns 1.01
array/shared/iteration/logical 2409604.5 ns 2411709 ns 1.00
array/shared/iteration/scalar 187125 ns 186000 ns 1.01
integration/byval/reference 1567334 ns 1561833 ns 1.00
integration/byval/slices=1 1555208 ns 1557292 ns 1.00
integration/byval/slices=2 2611333 ns 2611166.5 ns 1.00
integration/byval/slices=3 8553917 ns 7720208.5 ns 1.11
integration/metaldevrt 870958 ns 861375 ns 1.01
kernel/indexing 624875 ns 637542 ns 0.98
kernel/indexing_checked 614208.5 ns 659708 ns 0.93
kernel/launch 11375 ns 11333 ns 1.00
kernel/rand 586458 ns 579167 ns 1.01
latency/import 1380589000 ns 1375283937.5 ns 1.00
latency/precompile 28998263750 ns 28780620500 ns 1.01
latency/ttfp 1647912583 ns 1640608042 ns 1.00
metal/synchronization/context 19541 ns 19084 ns 1.02
metal/synchronization/stream 18208 ns 17917 ns 1.02

This comment was automatically generated by workflow using github-action-benchmark.

@christiangnrd

This comment was marked as outdated.

@KaanKesginLW

This comment was marked as outdated.

@github-actions

This comment was marked as spam.

Comment thread .gitignore Outdated
Comment thread lib/mpsgraphs/operations.jl
@KaanKesginLW

This comment was marked as resolved.

@christiangnrd christiangnrd force-pushed the feature/fft-support branch 2 times, most recently from 130ed6a to e3aeeea Compare January 19, 2026 01:09
christiangnrd

This comment was marked as outdated.

@liuyxpp

This comment was marked as off-topic.

@bjarthur

This comment was marked as resolved.

@christiangnrd christiangnrd marked this pull request as draft February 2, 2026 01:36
@christiangnrd christiangnrd force-pushed the feature/fft-support branch 2 times, most recently from e8d6b2c to ffdffe8 Compare February 4, 2026 02:59
@christiangnrd christiangnrd marked this pull request as ready for review February 4, 2026 03:31
@gbene gbene mentioned this pull request Feb 11, 2026
4 tasks
@christiangnrd christiangnrd force-pushed the feature/fft-support branch 3 times, most recently from 3c46dc2 to 60077f5 Compare February 19, 2026 16:38
Comment thread src/fft.jl
@christiangnrd christiangnrd force-pushed the feature/fft-support branch from 3d73d20 to d63bf16 Compare May 19, 2026 20:29

Development

Successfully merging this pull request may close these issues.

FFT support

9 participants